Classification of Printed and Handwritten Text: a Review
نویسندگان
چکیده
Separating handwritten and machine printed text from a document has many applications. Various types of documents like bank cheques and forms etc. are used in daily life which contains both handwritten as well as printed text. It is necessary to separate handwritten and machine printed text before processing it with optical character recognition system. Various strategies are used to discriminate between handwritten and printed text. Many methods are specific to a particular script and others can handle different type of scripts. The various steps to carry out this discrimination are performed in this sequence: scanning, pre-processing, feature extraction and classification. In this paper, an effort has been made to review the various techniques for discriminating handwritten and machine printed text.
منابع مشابه
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
متن کاملNeural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications that includes, reading aid for blind, bank cheques and conversion of any hand written document into structural text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
متن کاملShape codebook based handwritten and machine printed text zone extraction
In this paper, we present a novel method for extracting handwritten and printed text zones from noisy document images with mixed content. We use Triple-Adjacent-Segment (TAS) based features which encode local shape characteristics of text in a consistent manner. We first construct two codebooks of the shape features extracted from a set of handwritten and printed text documents respectively. We...
متن کاملA Survey on Arabic Character Recognition
Off-line recognition of text play a significant role in several application such as the automatic sorting of postal mail or editing old documents. It is the ability of the computer to distinguish characters and words. Automatic off-line recognition of text can be divided into the recognition of printed and handwritten characters. Off-line Arabic handwriting recognition still faces great challen...
متن کاملConnected Component Based Word Spotting on Persian Handwritten image documents
Word spotting is to make searchable unindexed image documents by locating word/words in a doc-ument image, given a query word. This problem is challenging, mainly due to the large numberof word classes with very small inter-class and substantial intra-class distances. In this paper, asegmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017